{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Exercise 1: Regression in Keras\n",
    "In this exercise, you will learn how to effectively use [Keras](https://keras.io/) to train neural network models. You will learn how to make your own Keras models, evaluate their performance, and see the different ways that models can overfit (and how to prevent them). Elements of this exercise have been adapted from the Keras tutorials for Keras.\n",
    "\n",
    "This notebook contains many sections that are filled out for you and many that you will need to fill out to complete the exercise (marked in <font color='red'>RED</font>). You are finished when \"Restarting and Run All Cells\" executes the entire notebook without producing any errors. Do not remove assert statements.\n",
    "\n",
    "**HINT**: An answer to each of the blank cells is hidden after each . Double click the cell to see the answer in its contents"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%matplotlib inline\n",
    "from matplotlib import pyplot as plt\n",
    "from sklearn.model_selection import train_test_split\n",
    "from sklearn.metrics import mean_absolute_error\n",
    "from tensorflow.keras import optimizers as opt\n",
    "from tensorflow.keras import Sequential\n",
    "from tensorflow.keras import layers\n",
    "from tensorflow import keras\n",
    "from time import perf_counter\n",
    "from math import isclose\n",
    "import pandas as pd\n",
    "import numpy as np\n",
    "import warnings\n",
    "np.random.seed(1)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Load Data\n",
    "We are going to use the [Auto MPG Dataset](https://archive.ics.uci.edu/ml/datasets/auto+mpg)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "First step is to download and cache it locally"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "dataset_path = keras.utils.get_file(\"auto-mpg.data\", \"http://archive.ics.uci.edu/ml/machine-learning-databases/auto-mpg/auto-mpg.data\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now, we read in the data using Pandas. As the dataset lacks headers, we have to make them ourselves"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "with open(dataset_path) as fp:\n",
    "    for i in range(5):\n",
    "        print(fp.readline().strip())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "column_names = ['MPG', 'Cylinders', 'Displacement', 'Horsepower', 'Weight',\n",
    "                'Acceleration', 'Model Year', 'Origin']\n",
    "data = pd.read_csv(dataset_path, names=column_names,\n",
    "                   na_values = \"?\", comment='\\t',\n",
    "                   sep=\" \", skipinitialspace=True)\n",
    "data.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The data has some missing values, and we'll just remove them"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "data.dropna(inplace=True)\n",
    "print(f'Total number of entries: {len(data)}')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This is far from \"big data\", but will help us learn Keras with minimal waiting around for computations to finish"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Preprocessing\n",
    "There are a few things we need to deal with about the data first to make it more accessible to machine learning.\n",
    "\n",
    "The first is the \"Origin\" column, which is a categorical variable expressed as a list of numbers. \n",
    "Categorial variables expressed this way cause problems with machine learning because the ordering of numbers suggest cardinality that does not exist\n",
    "(i.e., Origin \"1\" is not less than Origin \"2\").\n",
    "We will use [one-hot encoding](https://en.wikipedia.org/wiki/One-hot) to remove this false ordering."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "new_origin = pd.get_dummies(data['Origin'], prefix='Origin')\n",
    "new_origin.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Replace the origin column with these new columns\n",
    "\n",
    "*Pro Tip*: [Pandas](https://pandas.pydata.org/) makes working with tabular data in Python very easy. If you are about to write a loop to operate on tabular data, check to see if Pandas already has a function for what you're trying to do. Pandas makes your code much faster and easier to read/maintain. Start with the excellent [10 Minutes to Pandas](https://pandas.pydata.org/pandas-docs/stable/getting_started/10min.html)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "data = pd.concat([data, new_origin], axis=1)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "data.drop('Origin', axis=1, inplace=True)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "data.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We now are going to mark which columns are inputs and outputs"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "y_col = 'MPG'\n",
    "X_cols = data.columns[1:]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Output is \"MPG\" and input is anything except MPG"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Another trick needed for training a Neural Network is that the input values should all be the same order of magnitude. \n",
    "As shown above, our data is not. So, we will normalize the datasets"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "data[X_cols] = data[X_cols].apply(lambda x: (x - x.mean()) / x.std())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "data.head(2)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Our data is now ready for training.\n",
    "A best practice for evaluating neural networks is to split data into a training and validation sets.\n",
    "When comparing different machine learning models, you should test them against data that was neither used to train the model nor select the hyperparameters of the model - the \"validation set\"."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "train_data, valid_data = train_test_split(data, test_size=0.1)\n",
    "print(f'Training data has {len(train_data)} entries')\n",
    "print(f'Validation data has {len(valid_data)} entries')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Quick Tutorial: Working with Keras\n",
    "Building Keras models is decomposed into three different phases: defining the architecture, setting the optimizer, and training the model. We will go through each step independentally."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Defining an Architecture\n",
    "Keras provides a wide variety of layers that can be combined together to form complicated networks.\n",
    "We are only going to focus on the [Sequential](https://keras.io/getting-started/sequential-model-guide/) type of model for simplicity, \n",
    "but you will get to make complicated models with branches in the Generative models session later today.\n",
    "\n",
    "The Sequential model takes a list of layers to generate combine in to a single model.\n",
    "The first element must specify the shape of the input data, which is the number of features in this case.\n",
    "The last element defines the shape of the outputs and must allow for the outputs to output on the entire range of the data.\n",
    "`linear` is a good choice for regression problems"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "model = Sequential([\n",
    "    layers.Dense(64, activation='relu', input_shape=(len(X_cols),)),\n",
    "    layers.Dense(64, activation='relu'),\n",
    "    layers.Dense(1, activation='linear')\n",
    "])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Taking a look at the network we created"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "print(f'Shape of the input: {model.input_shape}')\n",
    "print(f'Shape of the output: {model.output_shape}')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "It has an input shape of $N \\times 9$, where $N$ can be any number of training entries, and $N \\times 1$ outputs (i.e., just MPG)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "model.summary()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The model has 5 layers: an input layer (not shown), 2 hidden layers, and an output layer. "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Concept Review! How do we get 4865 parameters?\n",
    "Remember our simple neural network from the lecture:\n",
    "\n",
    "<img width=320px src=\"./img/simple-mlp.png\"/>\n",
    "\n",
    "_First Hidden Layer_: Each input node is connected to each node in the first hidden layer (i.e., the network is \"fully connected\"): $([\\text{9 Inputs}] + [\\text{1 Bias per Hidden Node}]) \\times [\\text{64 Hidden Nodes}] = 640\\text{ parameters}$\n",
    "\n",
    "_Second Hidden Layer_: Each node in the first hidden layer is connected to each node in the second hidden layer: $([\\text{64 hidden nodes}] + [\\text{1 Bias per Hidden Node}]) \\times [\\text{64 Hidden Nodes}] = 4160\\text{ parameters}$\n",
    "\n",
    "_Output Layer_: Each second hidden layer node is connected to the output node and that output node has a bias term: $[\\text{64 hidden nodes}] + [\\text{1 Bias}] = 65\\text{ parameters}$\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Training a Model\n",
    "The first step to train a model is to \"compile\" it.\n",
    "Compiling defines the optimizer and the loss function for the network."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "model.compile('adam', loss='mean_squared_error')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We compile with the Adam optimizer and using a mean squared error loss.\n",
    "Note that you can define more settings for these optimizers by creating the optimizer as an object."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "adam = keras.optimizers.Adam(lr=1e-3)\n",
    "model.compile(adam, loss=keras.metrics.mean_squared_error)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The model is ready to be trained once you compile it.\n",
    "\n",
    "The `fit` operation, like its analogue in scikit-learn, takes the inputs and outputs for the training data as arguments.\n",
    "It also has many other options specific to neural networks, such as the \"batch size\" (see later module) and a \"validation split.\" \n",
    "\n",
    "Validation sets are a very important tool in Neural Networks as they help determine if the model is overfitting during the training process."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "history = model.fit(train_data[X_cols], train_data[y_col], batch_size=32, validation_split=0.1, epochs=8)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The `history` object returned by the fit function contains details about the training process."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "fig, ax = plt.subplots()\n",
    "\n",
    "ax.plot(history.epoch, history.history['loss'], label='Training Loss')\n",
    "ax.plot(history.epoch, history.history['val_loss'], label='Validation Loss', linestyle='--')\n",
    "\n",
    "fig.set_size_inches(3.5, 2.5)\n",
    "ax.set_xlabel('Epoch')\n",
    "ax.set_ylabel('Loss ($MPG^2$)')\n",
    "ax.legend()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The loss for both the training data and the validation are both decreasing with epoch, which means our model is training correctly!"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Running the Model\n",
    "Now that our model is trained, we can use it to predict the properties of the validation set we held out with the predict option."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "pred_y = model.predict(valid_data[X_cols])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "fig, ax = plt.subplots()\n",
    "\n",
    "ax.scatter(valid_data[y_col], pred_y)\n",
    "\n",
    "ax.set_xlim(0, max(valid_data[y_col].max(), pred_y.max()) + 5)\n",
    "ax.set_ylim(ax.get_xlim())\n",
    "\n",
    "ax.plot(ax.get_xlim(), ax.get_xlim(), 'k--')\n",
    "\n",
    "fig.set_size_inches(3.5, 3.5)\n",
    "ax.set_xlabel('Actual MPG')\n",
    "ax.set_ylabel('Predicted MPG')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Admittedly, the fitness is not very good. But, we only trained for 8 epochs."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Part 1: Overfitting a Neural Network\n",
    "A key problem with neural networks is that the more you train them, the more likely they are to become overfit.\n",
    "In this part, we need you to create a neural network and overfit it.\n",
    "\n",
    "<font color='red'>Step 1: Make a neural network with 3 hidden layers of 128 elements per hidden layer each and ReLU activations.</font>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<font color='white'>\n",
    "model = Sequential([\n",
    "    layers.Dense(128, activation='relu', input_shape=(len(X_cols),)),\n",
    "    layers.Dense(128, activation='relu'),\n",
    "    layers.Dense(128, activation='relu'),\n",
    "    layers.Dense(1, activation='relu')\n",
    "])\n",
    "</font>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "assert model.count_params() == 34433"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<font color='red'>Fit the model with a batch size of 32 for 1024 epochs with a hold out set of 10%. Use Adam with the default settings and mean squared error loss. Save the history as a variable named `history`</font>\n",
    "\n",
    "*Pro Tip*: Keras produces a lot of status messages with its default `verbose` setting of the `fit` function. Consider using `verbose=0` to turn it off if that bothers you."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<font color=\"white\">\n",
    "model.compile('adam', 'mean_squared_error')\n",
    "history = model.fit(train_data[X_cols], train_data[y_col], batch_size=32, validation_split=0.1, verbose=0, epochs=1024)\n",
    "</font>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Plot the loss as a function of the number of epochs"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "fig, ax = plt.subplots()\n",
    "\n",
    "ax.semilogy(history.epoch, history.history['loss'], label='Training Loss')\n",
    "ax.plot(history.epoch, history.history['val_loss'], label='Validation Loss', linestyle='--')\n",
    "\n",
    "fig.set_size_inches(3.5, 2.5)\n",
    "ax.set_xlabel('Epoch')\n",
    "ax.set_ylabel('Loss ($MPG^2$)')\n",
    "ax.legend()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Note how the validation loss stops decreasing around the 50th epoch and then increases. \n",
    "This is overfitting.\n",
    "The network is continuing to get better at predicting the training data at the expense of being generalizable to other data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "pred_y = model.predict(valid_data[X_cols])\n",
    "overfit_score = mean_absolute_error(valid_data[y_col], pred_y)\n",
    "print(f'Mean absolute error on held-out set: {overfit_score : 0.2f} MPG')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "fig, ax = plt.subplots()\n",
    "\n",
    "ax.scatter(valid_data[y_col], pred_y)\n",
    "\n",
    "ax.set_xlim(0, max(valid_data[y_col].max(), pred_y.max()) + 5)\n",
    "ax.set_ylim(ax.get_xlim())\n",
    "\n",
    "ax.plot(ax.get_xlim(), ax.get_xlim(), 'k--')\n",
    "\n",
    "fig.set_size_inches(3.5, 3.5)\n",
    "ax.set_xlabel('Actual MPG')\n",
    "ax.set_ylabel('Predicted MPG')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The model does predict better than when we trained with only 8 epochs, but we can do better"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## <font color='red'>Part 2: Preventing Overfitting with Early Stopping</font>\n",
    "Early stopping is the idea that you detect when the error on the validation set is increasing and roll back to the network state with the best valiation error."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Quick Tutorial: Pre-trained Models\n",
    "If you were to call the `fit` method again, the Keras model will resume training from its previous state.\n",
    "This is a nice feature if you need to restart a training from a saved checkpoint or using the weights from another model for pretraining (which we will dicuss in a later tutorial).\n",
    "However, it interferes with the lesson on early stopping we want here.\n",
    "\n",
    "So, we are going to make a new model before training.\n",
    "To make this more convenient, we are going to make a function that generates a Keras model so to avoid having to write out the architecture each time.\n",
    "\n",
    "*Pro Tip*: Making these \"model building functions\" can save you a lot of time when testing new architectures."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def build_model(n_layers=3, hidden_size=128):\n",
    "    model = Sequential([layers.Dense(hidden_size, activation='relu', input_shape=(len(X_cols),))])\n",
    "    \n",
    "    for i in range(n_layers-1):\n",
    "        model.add(layers.Dense(hidden_size, activation='relu'))\n",
    "        \n",
    "    model.add(layers.Dense(1, activation='linear'))\n",
    "    return model"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "model = build_model()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "model.compile('adam', 'mean_squared_error')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Creating the Callback\n",
    "Keras implement early stopping using a \"callback.\"\n",
    "Callback functions by performing an operation (e.g., assessing validation loss) at different points of the model training.\n",
    "In our case, we will use the [`EarlyStopping`](https://keras.io/callbacks/#earlystopping) callback, which is one of many [available in Keras](https://keras.io/callbacks/)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "callbacks = [keras.callbacks.EarlyStopping(restore_best_weights=True, patience=25)]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The callback we defined will stop only if the validation loss does not improve after 25 epochs and will restore the state of the model at the point with lowest loss. We use the callbacks when calling the fit operation."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<font color='red'>Train the model using the same batch size, but now add the early-stopping callback. Store the result in `history`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<font color='white'>history = model.fit(train_data[X_cols], train_data[y_col], batch_size=32, validation_split=0.1, verbose=0, epochs=256, callbacks=callbacks)</font>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The following cells will plot your results"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "fig, ax = plt.subplots()\n",
    "\n",
    "ax.semilogy(history.epoch, history.history['loss'], label='Training Loss')\n",
    "ax.plot(history.epoch, history.history['val_loss'], label='Validation Loss', linestyle='--')\n",
    "\n",
    "fig.set_size_inches(3.5, 2.5)\n",
    "ax.set_xlabel('Epoch')\n",
    "ax.set_ylabel('Loss ($MPG^2$)')\n",
    "ax.legend()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The model now stops training at around 60 epochs."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "assert len(history.epoch) < 256  # Model should stop training before 256 epochs"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "pred_y = model.predict(valid_data[X_cols])\n",
    "earlystop_score = mean_absolute_error(valid_data[y_col], pred_y)\n",
    "print(f'Mean absolute error on held-out set: {earlystop_score : 0.2f} MPG')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "assert earlystop_score < overfit_score"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "fig, ax = plt.subplots()\n",
    "\n",
    "ax.scatter(valid_data[y_col], pred_y)\n",
    "\n",
    "ax.set_xlim(0, max(valid_data[y_col].max(), pred_y.max()) + 5)\n",
    "ax.set_ylim(ax.get_xlim())\n",
    "\n",
    "ax.plot(ax.get_xlim(), ax.get_xlim(), 'k--')\n",
    "\n",
    "fig.set_size_inches(3.5, 3.5)\n",
    "ax.set_xlabel('Actual MPG')\n",
    "ax.set_ylabel('Predicted MPG')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The performance should be better than when you ran the network to its full number of epochs.\n",
    "\n",
    "Early stopping is one of the many approaches to preventing overfitting. Some other techniques include:\n",
    "\n",
    "- Reduce the model complexity\n",
    "- Employ [Dropout](https://keras.io/layers/core/#dropout) layers, which randomly drop connections in the neural network during training. \n",
    "- Using [regularization](https://keras.io/regularizers/), which ensures the weights do not become too large"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.10"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
